Publication Updated on May 21 2020

Plotting a moving average of new deaths per day for all 50 states, DC, and a national total. The number and color refer to the magnitude of deaths on the most recent day for which data is available (May 19 2020).

Goals and Inspiration

In the context of the 2020 Coronavirus pandemic, data presentation has an underappreciated role in guiding both public perception and policy. Concerned Americans are filled with comparative questions that are incredibly difficult to answer with tabular data. Are fewer people dying? Is my state faring worse than others? Is the recovery similar across all states? The visualizations that we use to translate the data are critical to understanding and making these comparisons. To begin with, the figure below is a reproduction of the chart displayed at many recent White House press briefings. This chart places the daily cumulative total of deaths for all 50 states on a shared set of axes. This is the bare minimum of data visualization, and its design actually obfuscates important information.

Addressing these Gaps

The issues above can be addressed with two main changes to the graphic:

  1. Introduce interactivity to the graphic so that the viewer can filter it and hover over it. Ideally, the viewer should be able to highlight groups within the graphic for comparison.

  2. Present, without clutter, metrics other than Cumulative Deaths, so that the nuances of the changes in coronavirus fatality can be fully understood.

By using the Plotly front end, this can be accomplished relatively simply. Plotly enables interactive plots from R, with excellent features for filtering, panning, and comparing data at shared x-values. This addresses Change 1. By placing Cumulative Deaths, New Deaths per Day, and New Deaths per Day per 100,000 on a shared Date axis, we accomplish Change 2.
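The three metrics named above can all be derived from a single cumulative series. The following is a minimal Python sketch of that derivation, not the code behind the plots (which were built in R). The function names, the toy cumulative values, and the population figure are all hypothetical and chosen only for illustration. The trailing moving average is included because the headline figure plots one; the window actually used there is not stated, so the value below is an assumption.

```python
def daily_new(cumulative):
    """Day-over-day increases of a cumulative series (one value shorter than the input)."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def per_100k(values, population):
    """Scale counts to a rate per 100,000 residents."""
    return [value * 100_000 / population for value in values]


def moving_average(values, window):
    """Trailing mean over at most `window` of the most recent values."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# Hypothetical inputs, for illustration only (not real data).
cumulative_deaths = [10, 14, 21, 30, 36]
population = 2_000_000

new_deaths = daily_new(cumulative_deaths)
new_deaths_per_100k = per_100k(new_deaths, population)
smoothed = moving_average(new_deaths, window=3)  # window size is an assumption

print(new_deaths)
print(new_deaths_per_100k)
print(smoothed)
```

Dividing by population before comparing states is what lets a small state and a large state share one axis in the per 100,000 panel.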

Features of the Plot:

Using the Plot

The above rework is largely sufficient for answering the questions most common in coronavirus discourse. From the three-area comparison above, we can answer our original questions. The Daily Deaths chart shows us that the number of national daily deaths is trending downwards. The Daily Deaths per 100k People chart shows us that the fatality rate of New York is above the national average when adjusted for population, and that California is below the national average. Double-click any state to remove all filters, and then single-click Nationwide to remove the national total. The most recent dates in the Daily Deaths per 100k People chart show us that three states are actually getting worse in proportion to their population: New Jersey, Connecticut, and Massachusetts. Filter to these states to explore more.

Going Further

I was also interested in taking complete creative license with the data in order to explore novel visualizations. Rather than constraining the x-axis to time, which marches along in an entirely predictable linear fashion, why not give the x-axis a more interesting variable? I explored the idea of a bar chart, representing Cumulative Deaths for a region, that was filled as time went by. The idea floundered in rendering. Instead, I settled on a dot plot in which each dot is the cumulative total for a given day. That way, the distance between dots indicates the speed at which the cumulative total is increasing. The result:

Using the Plot

1. Overall Layer

The default view acts as a horizontal bar chart. States are plotted by population, but as population remains constant, each state occupies only one row of the chart. The horizontal length of each state represents the Cumulative Deaths at the most recent date. Of course, the Nationwide Cumulative Deaths and Population exceed those of any state. As noted previously, each dot represents a new day, so a large horizontal distance between dots represents a large number of deaths for that day. We can filter out national totals to see the states more clearly by single-clicking on Nationwide.

2. All States Layer

Filtered down to states, we can see how the states are distributed by population. This enables appropriate comparisons and identification of trends. For example, New York has the largest number of Cumulative Deaths, but is only the 4th largest in terms of population, making it a true outlier in the impact of Coronavirus. This imbalance is also true for New Jersey. Let’s filter to two states for comparison by double-clicking on Michigan and then single-clicking on New Jersey. With similar populations, it is more reasonable to compare these states.

Let’s activate the Compare data on hover tool in the upper right of the plot. By placing the cursor in the space between the two lines of dots, we can see on what date each of the two states reached the same number of cumulative deaths. For example, by hovering over the 4,000 death mark, we can see that on 5/2/2020, Michigan reached 4,020 cumulative deaths, and that New Jersey had reached 4,080 deaths back on 4/18/2020. It would therefore be interesting for policy makers in Michigan to look at the policy decisions made in New Jersey back on 4/18 and see if there is anything to be learned.
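The hover comparison amounts to asking, for each state, on which date its cumulative total first reached a given level. A minimal Python sketch of that lookup follows. It assumes each cumulative series never decreases (real reporting can include downward revisions, which would break the binary search), and the helper name, the two series, and the dates are hypothetical stand-ins rather than real data.

```python
from bisect import bisect_left
from datetime import date, timedelta


def first_date_reaching(dates, cumulative, threshold):
    """Return the first date whose cumulative total is >= threshold, or None.

    Assumes `cumulative` is sorted in non-decreasing order.
    """
    index = bisect_left(cumulative, threshold)
    return dates[index] if index < len(cumulative) else None


# Hypothetical series for two regions, for illustration only.
start = date(2020, 1, 1)
dates = [start + timedelta(days=offset) for offset in range(5)]
region_a = [40, 90, 130, 170, 200]
region_b = [10, 30, 60, 95, 140]

reached_a = first_date_reaching(dates, region_a, 100)
reached_b = first_date_reaching(dates, region_b, 100)

# Number of days by which region B trails region A at this level.
lag_days = (reached_b - reached_a).days
print(reached_a, reached_b, lag_days)
```

The lag between the two dates is the quantity of interest for the policy comparison: it says how far back in the leading state's timeline the trailing state should look.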

3. Single State Layer

Now, let’s isolate a single state: New York. Double-click to isolate the state. We now see a very different picture. The dots still align with the cumulative total for a given date, but we have two new pieces of information plotted on the chart.

Taken together, this view gives us a sense of the scale of deaths (Cumulative Deaths), the rate of increase in those Cumulative Deaths (the shaded line representing deaths OR the size of the gap between dots), and the change in the deaths per day over time (the solid line OR the vertical change in the deaths per day lines).

For New York, we can see that the deaths per day were increasing until about 4/17/2020, quickly at first and then more slowly. On 4/17, the deaths per day began a steady decrease, reflected by both the downward-sloping solid lines and the shorter shaded lines. A particularly deadly day on 5/6 is the exception to a much-improving situation in New York over the past 2 weeks, with an average decrease of 11 new deaths per day over the 2 weeks ending on 5/14.
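One plausible way to arrive at an "average decrease in new deaths per day" figure is to average the day-over-day changes in the daily count across the window. The Python sketch below shows that reading; it is an assumption about the calculation, not a statement of how the number above was produced, and the function name and input values are hypothetical.

```python
def average_daily_change(new_deaths):
    """Mean of the day-over-day differences in a series of daily counts.

    The intermediate differences cancel, so this equals
    (last - first) / (number of days - 1). Negative means a decline.
    """
    if len(new_deaths) < 2:
        raise ValueError("need at least two days to measure a change")
    return (new_deaths[-1] - new_deaths[0]) / (len(new_deaths) - 1)


# Hypothetical daily counts, for illustration only.
window = [100, 90, 95, 70, 60]
print(average_daily_change(window))
```

Because only the first and last days matter under this definition, a single spike inside the window does not change the result, although a spike on either endpoint would.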

Conclusion

Hopefully, this paper has given the reader some new insights into what they can and should expect from modern data visualizations. As Edward Tufte puts it, the guiding principle for design is thus:

Graphical Excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.

I have tried my best to reimagine Coronavirus visualizations in the context of this advice, and would love to hear what you like, and especially what you don’t like, about my efforts. Please contact me to share whatever feedback or questions you have.